Automatic, model-based detection of pause-less phrase boundaries from fundamental frequency and duration features

نویسندگان

  • Mahsa Sadat Elyasi Langarani
  • Jan P. H. van Santen
چکیده

Prosodic phrase boundaries (PBs) are a key aspect of spoken communication. In automatic PB detection, it is common to use local acoustic features, textual features, or a combination of both. Most approaches – regardless of features used – succeed in detecting major PBs (break score “4” in ToBI annotation, typically involving a pause) while detection of intermediate PBs (break score “3” in ToBI annotation) is still challenging. In this study we investigate the detection of intermediate, “pauseless” PBs using prosodic models, using a new corpus characterized by strong prosodic dynamics and an existing (CMU) corpus. We show how using duration and fundamental frequency modeling can improve detection of these PBs, as measured by the F1 score, compared to Festival, which only uses textual features to detect PBs. We believe that this study contributes to our understanding of the prosody of phrase breaks.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic Segmentation of Continuous Speech on Word and Phrase Level based on Suprasegmental Features

This article investigates whether it is possible to segment continuous speech on word and phrasal level by examination of suprasegmental parameters, in case of bound stress languages like Hungarian and Finnish. The final aim is to increase the robustness of speech recognition on language modelling level by the detection of word and phrase boundaries and so we can significantly decrease the sear...

متن کامل

Using Prosody for Automatic Sentence Segmentation of Multi-party Meetings

We explore the use of prosodic features beyond pauses, including duration, pitch, and energy features, for automatic sentence segmentation of ICSI meeting data. We examine two different approaches to boundary classification: score-level combination of independent language and prosodic models using HMMs, and feature-level combination of models using a boosting-based method (BoosTexter). We repor...

متن کامل

Accent type recognition and syntactic boundary detection of Japanese using statistical modeling of moraic transitions of fundamental frequency contours

Experiments on accent type recognition and syntactic boundary detection of Japanese speech were conducted based on the statistical modeling of voice fundamental frequency contours formerly proposed by the authors. In the proposed modeling, fundamental frequency contours are segmented into moraic units to generate moraic contours, which are further represented by discrete codes. After modeling t...

متن کامل

Modeling of sentence-medial pauses in bangla readout speech: occurrence and duration

Control of pause occurrence and duration is an important issue for text-to-speech synthesis systems. In text-readout speech, pauses occur unconditionally at sentence boundaries and with high probability at major syntactic boundaries such as clause boundaries, but more or less arbitrarily at minor syntactic boundaries. Pause duration tends to be longer at the end of a longer syntactic unit. A de...

متن کامل

Use of Prosodic Features in Speech Recognition

Two methods were proposed for the use of prosodic features in speech recognition: one to detect major syntactic (phrase) boundaries as the initial phase of speech recognition, and the other to check the feasibility of the results of ordinary recognition process from the viewpoint of prosodic features. In the rst method, fundamental frequency contours were assumed as waveforms as functions of ti...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016